Crown-of-Thorns Starfish (COTS) outbreaks are a major cause of coral loss on the Great Barrier Reef (GBR), and substantial surveillance and control programs are underway to manage COTS populations to ecologically sustainable levels. In this paper, we present an edge-device-based underwater data collection and curation system for COTS surveillance. In particular, we leverage the power of deep-learning-based object detection techniques and propose a resource-efficient COTS detector that performs detection inference on the edge device to assist marine experts with COTS identification during the data collection phase. Preliminary results show that several strategies for improving computational efficiency (e.g., batch processing, frame skipping, reduced model input size) can be combined to run the proposed detection model on edge hardware with low resource consumption and low information loss.
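The efficiency strategies named above (batch processing, frame skipping, and a reduced model input size) compose in a straightforward way. The sketch below is a minimal illustration of that combination around a hypothetical `detector` callable; the stride, batch size, and input size are assumed placeholder values, not the system described in the abstract.

```python
# Illustrative sketch: combining frame skipping and batched inference to cut
# per-frame cost on an edge device. The `detector` callable and the default
# values below are hypothetical placeholders.
from typing import Callable, Iterable, List

import numpy as np


def run_edge_detection(
    frames: Iterable[np.ndarray],
    detector: Callable[[np.ndarray], list],   # takes a (B, H, W, 3) batch
    frame_stride: int = 3,                    # process every 3rd frame
    batch_size: int = 4,                      # amortize inference overhead
    input_size: tuple = (640, 640),           # smaller input -> cheaper inference
) -> List[list]:
    """Return detections for the frames that are actually processed."""
    batch, results = [], []
    for i, frame in enumerate(frames):
        if i % frame_stride != 0:
            continue  # frame skipping: drop frames between detections
        # Downscale to the model input size (nearest-neighbor for brevity).
        h, w = input_size
        ys = np.linspace(0, frame.shape[0] - 1, h).astype(int)
        xs = np.linspace(0, frame.shape[1] - 1, w).astype(int)
        batch.append(frame[ys][:, xs])
        if len(batch) == batch_size:
            results.extend(detector(np.stack(batch)))  # one batched forward pass
            batch = []
    if batch:
        results.extend(detector(np.stack(batch)))
    return results
```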
Crown-of-Thorns Starfish (COTS) outbreaks are a major cause of coral loss on the Great Barrier Reef (GBR), and substantial surveillance and control programs are underway in an attempt to manage COTS populations to ecologically sustainable levels. We release a large-scale, annotated underwater image dataset from a COTS outbreak area on the GBR to encourage research in machine learning and AI-driven technologies for improved detection, monitoring, and management of COTS populations at reef scale. The dataset is released and hosted as part of a competition that challenges the international machine learning community with the task of COTS detection from these underwater images.
As the accuracy of machine learning models increases at a fast rate, so does their demand for energy and compute resources. On a low level, the major part of these resources is consumed by data movement between different memory units. Modern hardware architectures contain a form of fast memory (e.g., cache, registers), which is small, and a slow memory (e.g., DRAM), which is larger but expensive to access. We can only process data that is stored in fast memory, which incurs data movement (input/output-operations, or I/Os) between the two units. In this paper, we provide a rigorous theoretical analysis of the I/Os needed in sparse feedforward neural network (FFNN) inference. We establish bounds that determine the optimal number of I/Os up to a factor of 2 and present a method that uses a number of I/Os within that range. Much of the I/O-complexity is determined by a few high-level properties of the FFNN (number of inputs, outputs, neurons, and connections), but if we want to get closer to the exact lower bound, the instance-specific sparsity patterns need to be considered. Departing from the 2-optimal computation strategy, we show how to reduce the number of I/Os further with simulated annealing. Complementing this result, we provide an algorithm that constructively generates networks with maximum I/O-efficiency for inference. We test the algorithms and empirically verify our theoretical and algorithmic contributions. In our experiments on real hardware we observe speedups of up to 45$\times$ relative to the standard way of performing inference.
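The abstract notes that instance-specific sparsity patterns matter once one tries to approach the exact lower bound, and that simulated annealing can push the I/O count below the 2-optimal strategy. As a purely illustrative toy, and not the paper's cost model or algorithm, the sketch below counts I/Os for a single sparse layer under an LRU-managed fast memory of assumed size and anneals the neuron evaluation order to reduce that count; all sizes and parameters are made up.

```python
# Toy illustration: I/Os for one sparse FFNN layer under a small fast memory
# with LRU replacement, plus simulated annealing over the evaluation order.
import math
import random
from collections import OrderedDict


def count_ios(neuron_inputs, order, fast_mem_size):
    """I/Os = loads of input values that are not already in fast memory."""
    cache, ios = OrderedDict(), 0
    for j in order:                      # evaluate output neurons in this order
        for i in neuron_inputs[j]:       # read the inputs neuron j depends on
            if i in cache:
                cache.move_to_end(i)     # LRU hit
            else:
                ios += 1                 # miss: load from slow memory
                cache[i] = True
                if len(cache) > fast_mem_size:
                    cache.popitem(last=False)
    return ios


def anneal_order(neuron_inputs, fast_mem_size, steps=5000, t0=5.0):
    order = list(range(len(neuron_inputs)))
    best = cur = count_ios(neuron_inputs, order, fast_mem_size)
    best_order = order[:]
    for s in range(steps):
        t = t0 * (1 - s / steps) + 1e-9
        a, b = random.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]          # propose a swap
        nxt = count_ios(neuron_inputs, order, fast_mem_size)
        if nxt <= cur or random.random() < math.exp((cur - nxt) / t):
            cur = nxt
            if cur < best:
                best, best_order = cur, order[:]
        else:
            order[a], order[b] = order[b], order[a]      # undo the swap
    return best, best_order


if __name__ == "__main__":
    random.seed(0)
    # Random sparse layer: 64 outputs, each reading 8 of 256 inputs.
    layer = [random.sample(range(256), 8) for _ in range(64)]
    print("naive order:", count_ios(layer, range(64), fast_mem_size=32))
    print("annealed   :", anneal_order(layer, fast_mem_size=32)[0])
```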
In this paper, a global reactive motion planning framework for robotic manipulators in complex dynamic environments is presented. In particular, the circular field predictions (CFP) planner from Becker et al. (2021) is extended to ensure obstacle avoidance for the whole structure of a robotic manipulator. Towards this end, a motion planning framework is developed that leverages global information about promising avoidance directions from arbitrary configuration-space motion planners, resulting in improved global trajectories while reactively avoiding dynamic obstacles and reducing the required computational power. The resulting motion planning framework is tested in multiple simulations with complex and dynamic obstacles and demonstrates great potential compared to existing motion planning approaches.
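For readers unfamiliar with circular (vortex) fields, the toy below shows the basic idea behind such planners: an obstacle induces an artificial magnetic field, and the resulting Lorentz-like force is always perpendicular to the velocity, steering the robot around the obstacle rather than merely pushing it away. This is a generic single-point illustration, not the CFP planner of Becker et al. (2021) or the whole-body extension described above; the field axis and gain are arbitrary assumptions.

```python
# Toy circular-field-style avoidance force for a point in 3-D.
import numpy as np


def circular_field_force(pos, vel, obstacle_pos, field_axis, gain=1.0, eps=1e-9):
    d = pos - obstacle_pos                       # vector from obstacle to robot
    dist = np.linalg.norm(d) + eps
    # Artificial magnetic field along the chosen axis, decaying with distance.
    b = gain / dist**2 * np.asarray(field_axis, dtype=float)
    # Lorentz-like force: perpendicular to the velocity, so it steers the
    # motion around the obstacle instead of braking or repelling it directly.
    return np.cross(vel, b)


# Example: a point moving in +x past an obstacle at the origin is deflected in -y.
f = circular_field_force(
    pos=np.array([-1.0, 0.2, 0.0]),
    vel=np.array([1.0, 0.0, 0.0]),
    obstacle_pos=np.zeros(3),
    field_axis=np.array([0.0, 0.0, 1.0]),
)
print(f)
```

Flipping the sign of `field_axis` routes the motion around the other side of the obstacle, which is the kind of avoidance-direction decision the framework above obtains from a global configuration-space planner.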
Federated Learning (FL) is a scheme for collaboratively training Deep Neural Networks (DNNs) with multiple data sources from different clients. Instead of sharing the data, each client trains the model locally, resulting in improved privacy. However, so-called targeted poisoning attacks have recently been proposed that allow individual clients to inject a backdoor into the trained model. Existing defenses against these backdoor attacks either rely on techniques like Differential Privacy to mitigate the backdoor, or analyze the weights of the individual models and apply outlier detection methods, which restricts these defenses to certain data distributions. However, adding noise to the models' parameters or excluding benign outliers might also reduce the accuracy of the collaboratively trained model. Additionally, allowing the server to inspect the clients' models creates a privacy risk due to existing knowledge extraction methods. We propose CrowdGuard, a model-filtering defense that mitigates backdoor attacks by leveraging the clients' data to analyze the individual models before aggregation. To prevent data leaks, the server sends the individual models to secure enclaves running in client-located Trusted Execution Environments. To effectively distinguish benign from poisoned models, even if the data of different clients are not independently and identically distributed (non-IID), we introduce a novel metric called HLBIM to analyze the outputs of the DNN's hidden layers. We show that the applied significance-based detection algorithm, combined with HLBIM, can effectively detect poisoned models, even in non-IID scenarios. Our extensive evaluation shows that CrowdGuard effectively mitigates targeted poisoning attacks, achieving a True-Positive Rate of 100% and a True-Negative Rate of 100% in various scenarios.
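To make the filtering idea concrete, the toy below has clients evaluate all submitted models on their own local data, summarize the hidden-layer outputs, and flag models whose summaries deviate strongly from the cross-model median. It substitutes a simple robust z-score for the paper's HLBIM metric and significance-based algorithm, uses made-up one-layer models, and says nothing about the TEE-based deployment.

```python
# Illustrative toy: flag models whose hidden-layer behavior on local data is an
# outlier relative to the other submitted models (not CrowdGuard's HLBIM).
import numpy as np


def hidden_summary(weights, local_x):
    """Mean hidden-layer activation of a toy one-layer model on local data."""
    return np.maximum(local_x @ weights, 0.0).mean(axis=0)   # ReLU, then mean


def flag_suspicious(models, local_x, z_thresh=3.5):
    summaries = np.stack([hidden_summary(w, local_x) for w in models])
    med = np.median(summaries, axis=0)
    mad = np.median(np.abs(summaries - med), axis=0) + 1e-12  # robust scale
    scores = np.abs(summaries - med) / mad                    # robust z-scores
    return [i for i, s in enumerate(scores.max(axis=1)) if s > z_thresh]


rng = np.random.default_rng(0)
local_x = rng.normal(size=(128, 16))
benign = [rng.normal(scale=0.1, size=(16, 8)) for _ in range(9)]
poisoned = [rng.normal(scale=0.1, size=(16, 8)) + 2.0]        # shifted weights
print(flag_suspicious(benign + poisoned, local_x))            # likely -> [9]
```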
Safety-critical systems typically undergo a hazard analysis before commissioning to identify and analyze potentially hazardous system states that may arise during operation. Currently, hazard analysis is mainly based on human reasoning, past experience, and simple tools such as checklists and spreadsheets. Increasing system complexity makes such approaches increasingly unsuitable. Moreover, testing-based hazard analysis is often not an option due to high costs or the risk of physical harm. A remedy for this are model-based hazard analysis methods, which rely either on formal models or on simulation models, each with its own benefits and drawbacks. This paper proposes a two-layer approach that combines the benefits of an exhaustive analysis using formal methods with a detailed analysis using simulation. Unsafe behaviors leading to unsafe states are first synthesized from a formal model of the system using supervisory control theory. The result is then used as input to a simulation in which a detailed analysis is performed with domain-specific risk metrics. Although the proposed approach is generally applicable, this paper demonstrates its benefits on an industrial human-robot collaboration system.
Graph databases (GDBs) enable processing and analysis of unstructured, complex, rich, and usually vast graph datasets. Despite the great significance of GDBs in both academia and industry, little effort has been made to integrate them with the predictive power of graph neural networks (GNNs). In this work, we show how to seamlessly combine nearly any GNN model with the computational capabilities of GDBs. For this, we observe that most of these systems are based on, or support, a graph data model called the Labeled Property Graph (LPG), in which vertices and edges can have arbitrarily complex sets of labels and properties. We then develop LPG2vec, an encoder that transforms an arbitrary LPG dataset into a representation that can be used directly with a broad class of GNNs, including convolutional, attentional, message-passing, and even higher-order or spectral models. In our evaluation, we show that LPG2vec properly preserves the rich information represented by LPG labels and properties, and that it improves prediction accuracy by up to 34% compared to graphs without LPG labels/properties, regardless of the targeted learning task or the GNN model used. In general, LPG2vec combines the predictive power of the most powerful GNNs with the full scope of information encoded in the LPG model, paving the way for neural graph databases, a class of systems in which the vast complexity of the maintained data will benefit from modern and future graph machine learning methods.
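The core encoding idea can be illustrated in a few lines: one-hot encode each vertex's labels and append its (numeric) property values, yielding a node feature matrix that any standard GNN layer can consume. The snippet below is a conceptual sketch with a made-up three-vertex graph, not the LPG2vec encoder itself.

```python
# Conceptual sketch of turning LPG labels/properties into node features.
import numpy as np

# A tiny labeled property graph: each vertex has a set of labels and a
# dictionary of properties (values kept numeric for simplicity).
vertices = {
    0: {"labels": {"Person"}, "props": {"age": 34}},
    1: {"labels": {"Person", "Admin"}, "props": {"age": 51}},
    2: {"labels": {"Paper"}, "props": {"year": 2022}},
}

label_vocab = sorted({l for v in vertices.values() for l in v["labels"]})
prop_vocab = sorted({p for v in vertices.values() for p in v["props"]})


def encode_vertex(v):
    label_part = [1.0 if l in v["labels"] else 0.0 for l in label_vocab]
    prop_part = [float(v["props"].get(p, 0.0)) for p in prop_vocab]
    return np.array(label_part + prop_part)


X = np.stack([encode_vertex(vertices[i]) for i in sorted(vertices)])
print(label_vocab + prop_vocab)
print(X)   # node feature matrix usable by any standard GNN layer
```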
Exponentially growing model sizes have driven the continued success of deep learning, but they come with excessive computation and memory costs. From the algorithmic perspective, model sparsification and quantization have been studied to alleviate the problem. From the architecture perspective, hardware vendors provide Tensor Cores for acceleration. However, it is very challenging to obtain practical speedups from sparse, low-precision matrix operations on Tensor Cores because of the strict data layout requirements and the lack of support for efficiently manipulating low-precision integers. We propose Magicube, a high-performance sparse-matrix library for low-precision integers on Tensor Cores. Magicube supports SpMM and SDDMM, two major sparse operations in deep learning. Experimental results on an NVIDIA A100 GPU show that Magicube achieves on average a 1.44x (up to 2.37x) speedup over the vendor-optimized library for sparse kernels, and a 1.43x speedup over the state of the art for end-to-end sparse Transformer inference with comparable accuracy.
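For reference, the semantics of one of the two supported operations, SpMM with int8 operands and int32 accumulation over a CSR-format sparse matrix, can be written down in a few lines. The sketch below only fixes the arithmetic being accelerated; it says nothing about the Tensor Core data layouts or kernels the library actually uses, and the example matrices are arbitrary.

```python
# Reference (unoptimized) SpMM: sparse int8 CSR matrix times dense int8 matrix,
# accumulated in int32.
import numpy as np


def spmm_csr_int8(indptr, indices, values, dense):
    """C = A_sparse @ B_dense with int8 inputs and int32 accumulation."""
    m, n = len(indptr) - 1, dense.shape[1]
    out = np.zeros((m, n), dtype=np.int32)
    for row in range(m):
        for k in range(indptr[row], indptr[row + 1]):
            col, val = indices[k], np.int32(values[k])
            out[row] += val * dense[col].astype(np.int32)
    return out


# 3x4 sparse matrix with 4 non-zeros, multiplied by a 4x2 dense matrix.
indptr = np.array([0, 2, 3, 4])
indices = np.array([0, 3, 1, 2])
values = np.array([1, -2, 3, 4], dtype=np.int8)
dense = np.arange(8, dtype=np.int8).reshape(4, 2)
print(spmm_csr_int8(indptr, indices, values, dense))
```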
Numerous microarchitectural optimizations have unlocked tremendous processing power for deep neural networks, thereby fueling the AI revolution. With these optimizations exhausted, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Rather than focusing on single accelerators, we investigate the data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low cost together with high job-scheduling flexibility. Specifically, HammingMesh can support full bandwidth and isolation for deep learning training jobs with two dimensions of parallelism. Furthermore, it also supports high global bandwidth for generic traffic. HammingMesh will thus power future large-scale deep learning systems with extreme bandwidth requirements.
We present an online, data-driven uncertainty quantification method to enable the development of safe human-robot collaboration applications. Safety and risk assessment of a system are closely tied to the accuracy of its measurements: parameters of interest often cannot be accessed directly through a known model and therefore have to be measured. However, due to the limited performance of sensors, and to unknown disturbances from the environment or from humans, measurements are usually subject to uncertainty. In this work, we quantify these measurement uncertainties by exploiting conserved quantities with known, system-specific properties that remain constant over time, space, or other state-space dimensions. The key idea of our method lies in a direct evaluation of the incoming data against these conservation equations at runtime. In particular, we estimate violations of the known domain-specific conservation properties and interpret them as the consequence of measurement uncertainties. We validate our method on a use case in the context of human-robot collaboration, highlighting the importance of our contribution for the successful development of safe robot systems under real-world conditions, e.g., in industrial settings. In addition, we show how the obtained uncertainty values can be mapped directly onto arbitrary safety limits (e.g., ISO 13849), which allows compliance with safety standards to be monitored at runtime.
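As a minimal illustration of the underlying idea, and not the paper's method, the sketch below evaluates an assumed conservation property (mechanical energy of a frictionless mass on a track) over a window of noisy measurements and reports the spread of the conserved quantity as an uncertainty estimate; the system, constants, and noise levels are invented for the example.

```python
# Toy conservation-based uncertainty estimate from noisy runtime measurements.
import numpy as np

G = 9.81          # gravitational acceleration [m/s^2]
MASS = 1.0        # [kg]


def energy(height, speed):
    """Total mechanical energy; constant for this (assumed) frictionless system."""
    return MASS * G * height + 0.5 * MASS * speed**2


def uncertainty_from_conservation(heights, speeds):
    """Spread of the conserved quantity over a measurement window -> uncertainty."""
    e = energy(np.asarray(heights), np.asarray(speeds))
    residual = e - e.mean()               # violation of the conservation property
    return residual.std()


# Noisy measurements of a mass released from rest at h = 1 m on a frictionless track.
rng = np.random.default_rng(1)
true_h = np.linspace(1.0, 0.5, 50)
true_v = np.sqrt(2 * G * (1.0 - true_h))
h_meas = true_h + rng.normal(scale=0.01, size=50)   # noisy height sensor
v_meas = true_v + rng.normal(scale=0.05, size=50)   # noisy speed sensor
print(f"estimated uncertainty (energy units): "
      f"{uncertainty_from_conservation(h_meas, v_meas):.3f}")
```

The resulting scalar is what could then be compared against a domain-specific limit, which is the kind of mapping onto safety standards mentioned in the abstract.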